Elite Bases Regression: A Real-time Algorithm for Symbolic Regression

Authors

  • Chen Chen
  • Changtong Luo
  • Zonglin Jiang
Abstract

Symbolic regression is an important but challenging research topic in data mining. It aims to discover the mathematical model underlying a data set. Genetic programming (GP) is one of the most popular methods for symbolic regression. However, its convergence can be too slow for large-scale problems with many variables, and this drawback has become a bottleneck in practical applications. In this paper, a new non-evolutionary real-time algorithm for symbolic regression, Elite Bases Regression (EBR), is proposed. EBR generates a set of candidate basis functions, coded as parse-matrices under specific mapping rules. Meanwhile, a certain number of elite bases are preserved and updated iteratively according to their correlation coefficients with respect to the target model. The regression model is then spanned by the elite bases. A comparative study is conducted between EBR and a recently proposed machine learning method for symbolic regression, Fast Function eXtraction (FFX). Numerical results indicate that EBR can solve symbolic regression problems more effectively.
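
The idea of keeping a small set of elite bases ranked by correlation with the target, and then spanning the model with them, can be illustrated with a minimal sketch. This is not the authors' implementation: the parse-matrix encoding and its mapping rules are not reproduced, and the candidate generator `candidate_batches`, the helpers `abs_corr`, `update_elite`, and `fit_span`, and the synthetic target are all hypothetical names and data chosen for illustration. The sketch assumes the absolute Pearson correlation is the ranking score and ordinary least squares is used to combine the elite bases.

```python
import numpy as np


def candidate_batches(X):
    """Yield batches of (name, column) candidate bases.

    Hand-written stand-in for the paper's parse-matrix generation (assumption).
    """
    n_vars = X.shape[1]
    for i in range(n_vars):
        xi = X[:, i]
        yield [
            (f"x{i}", xi),
            (f"x{i}^2", xi ** 2),
            (f"sin(x{i})", np.sin(xi)),
            (f"exp(x{i})", np.exp(xi)),
        ]
    # Pairwise products of distinct variables form the last batch.
    yield [
        (f"x{i}*x{j}", X[:, i] * X[:, j])
        for i in range(n_vars)
        for j in range(i + 1, n_vars)
    ]


def abs_corr(col, y):
    """Absolute Pearson correlation between a basis column and the target."""
    if np.std(col) == 0:
        return 0.0  # a constant column carries no correlation information
    return abs(np.corrcoef(col, y)[0, 1])


def update_elite(elite, batch, y, n_elite):
    """Merge a new batch into the elite set and keep the n_elite best scores."""
    scored = elite + [(abs_corr(col, y), name, col) for name, col in batch]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:n_elite]


def fit_span(elite, y):
    """Least-squares fit of y on an intercept plus the elite basis columns."""
    A = np.column_stack([np.ones_like(y)] + [col for _, _, col in elite])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, A @ coef


# Synthetic, noise-free target used only for this illustration.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(500, 2))
y = 1.0 + 2.0 * X[:, 0] * X[:, 1] + 1.5 * X[:, 1] ** 2

elite = []
for batch in candidate_batches(X):
    elite = update_elite(elite, batch, y, n_elite=3)

names = [name for _, name, _ in elite]
coef, y_hat = fit_span(elite, y)
print(names, np.round(coef, 3))
```

Because the elite set is updated batch by batch, a basis found early can be displaced later by a better-correlated one, which is the behaviour the abstract describes at a high level. How EBR actually generates candidates and decides when to stop is specified in the paper, not in this sketch.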

Similar resources

Shuffled Frog-Leaping Programming for Solving Regression Problems

There are various automatic programming models inspired by evolutionary computation techniques. Because it is important to devise an automatic mechanism for exploring the complicated search space of mathematical problems where numerical methods fail, evolutionary computation is widely studied and applied to real-world problems. One of the famous algorithms for optimization problems is shuffl...

Application of Symbolic Regression on Blast Furnace and Temper Mill Datasets

This work concentrates on three different modifications of a genetic programming system for symbolic regression analysis. The coefficient of correlation R is used as the fitness function instead of the mean squared error, and offspring selection is used to ensure a steady improvement of the achieved solutions. Additionally, as the fitness evaluation consumes most of the execution time, the generated...

A Deterministic and Symbolic Regression Hybrid Applied to Resting-State fMRI Data

Symbolic regression (SR) is one of the most popular applications of genetic programming (GP) and an attractive alternative to standard deterministic regression approaches, because it can generate free-form mathematical models from observed data without any domain knowledge. However, GP suffers from various issues that hinder its applicability to real-life problems. I...

Learn More about Your Data: A Symbolic Regression Knowledge Representation Framework

In this paper, we propose a flexible knowledge representation framework which uses symbolic regression to learn, and mathematical expressions to represent, the knowledge to be captured from data. In this approach, learning algorithms are used to generate new insights, which can be added to domain knowledge bases that in turn support symbolic regression. This is used for the generalization of the w...

Pursuing the Pareto Paradigm: Tournaments, Algorithm Variations & Ordinal Optimization

The ParetoGP algorithm, which adopts a multi-objective optimization approach to balancing expression complexity and accuracy, has proven to have a significant impact on symbolic regression of industrial data because it improves the speed and quality of model development as well as user model selection [1], [2], [3]. In this chapter, we explore a range of topics related to exploiting the Pareto p...

Journal:
  • CoRR

Volume: abs/1704.07313

Publication date: 2017